Speaker independent voiced-unvoiced detection evaluated in different speaking styles

Authors

Martin Heckmann

Marco Moebus

Frank Joublin

Christian Goerick

Abstract

We propose a new algorithm for voiced/unvoiced classification of speech on a phoneme or sample level. The algorithm is inspired by auditory-based approaches and combines two cues: one based on the energy distribution of the signal and the other on its harmonicity. To extract the harmonicity, we apply a Gammatone filterbank to the signal and calculate a histogram of the zero crossings of the filter channels. A measure similar to the variance of the zero crossings yields the harmonicity cue. The performance of the algorithm was measured on several minutes of read and spontaneous speech from various speakers. An algorithm proposed by Mustafa et al. [1] served as the benchmark. The results show that our algorithm performs significantly better on both read and spontaneous speech and, in particular, seems better able to cope with different speaking styles.
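The harmonicity cue described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not give the filter parameters, the histogram construction, or the exact variance-like measure. The sketch below assumes a fourth-order gammatone impulse response with the Glasberg-Moore ERB bandwidth, and it replaces the histogram step with a simpler stand-in, the coefficient of variation of the intervals between upward zero crossings in each channel, keeping the most regular channel. All function names and parameter values are hypothetical.

```python
import numpy as np


def gammatone_fir(fc, fs, order=4, duration=0.064):
    """Truncated gammatone impulse response centred at fc (Hz).

    Order, duration and the ERB formula are assumptions, not values
    taken from the paper.
    """
    t = np.arange(int(duration * fs)) / fs
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)  # Glasberg-Moore ERB in Hz
    b = 1.019 * erb
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.linalg.norm(g)


def upward_crossing_times(x, fs):
    """Times (s) of negative-to-positive zero crossings, linearly interpolated."""
    idx = np.flatnonzero((x[:-1] < 0) & (x[1:] >= 0))
    frac = x[idx] / (x[idx] - x[idx + 1])  # sub-sample offset in (0, 1]
    return (idx + frac) / fs


def harmonicity_cue(signal, fs, centre_freqs):
    """Stand-in harmonicity score in [0, 1]; higher means more regular crossings."""
    cvs = []
    for fc in centre_freqs:
        # "valid" keeps only samples where the filter fully overlaps the signal,
        # which avoids the start-up transient.
        y = np.convolve(signal, gammatone_fir(fc, fs), mode="valid")
        intervals = np.diff(upward_crossing_times(y, fs))
        if intervals.size >= 2:
            cvs.append(np.std(intervals) / np.mean(intervals))
    if not cvs:
        return 0.0
    # Use the most regular channel; clip so the score stays within [0, 1].
    return float(1.0 - min(min(cvs), 1.0))


# Synthetic check: a harmonic complex (voiced-like) versus white noise.
fs = 8000
t = np.arange(int(0.5 * fs)) / fs
voiced_like = sum(np.sin(2 * np.pi * 120 * k * t) for k in range(1, 6))
noise_like = np.random.default_rng(0).standard_normal(t.size)
centre_freqs = np.geomspace(80, 800, 16)

print("harmonic complex:", harmonicity_cue(voiced_like, fs, centre_freqs))
print("white noise:     ", harmonicity_cue(noise_like, fs, centre_freqs))
```

In a channel dominated by the fundamental, the filtered signal crosses zero once per pitch period, so the intervals are nearly constant and the score approaches 1. Band-limited noise produces jittered crossings and a lower score. A full classifier along the lines of the abstract would combine such a score with an energy-based cue, which is not sketched here.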

Similar resources

Independent Modelling of High and Low Energy Speech Frames for Spoofing Detection

Spoofing detection systems for automatic speaker verification have moved from only modelling voiced frames to modelling all speech frames. Unvoiced speech has been shown to carry information about spoofing attacks and anti-spoofing systems may further benefit by treating voiced and unvoiced speech differently. In this paper, we separate speech into low and high energy frames and independently m...

Structure-based Speech Classification Using Non-linear Embedding Techniques

"Usable speech" refers to those portions of corrupted speech that can be used to determine a reasonable number of distinguishing features of the speaker. It has previously been shown that using only voiced segments of speech improves the usable speech detection system, and also that unvoiced speech does not contribute significant information about the speaker(s) for speaker ide...

HMM-based MAP Prediction of Formant Frequencies from N

This paper describes how formant frequencies of voiced and unvoiced speech can be predicted from mel-frequency cepstral coefficients (MFCC) vectors using maximum a posteriori (MAP) estimation within a hidden Markov model (HMM) framework. Gaussian mixture models (GMMs) are used to model the local joint density of MFCCs and formant frequencies. More localised prediction is achieved by modelling s...

Processing of Voiced and Unvoiced Acoustic Stimuli in Musicians

Past research has shown that musical training induces changes in the processing of supra-segmental aspects of speech, such as pitch and prosody. The aim of the present study was to determine whether musical expertise also leads to an altered neurophysiological processing of sub-segmental information available in the speech signal, in particular the voice-onset-time. Using high-density EEG-recor...

Voiced-Unvoiced-Silence Detection Problem

One of the most difficult problems in speech analysis is reliable discrimination among silence, unvoiced speech, and voiced speech that has been transmitted over a telephone line. Although several methods have been proposed for making this three-level decision, these schemes have met with only modest success. In this paper, a novel approach to the voiced-unvoiced-silence detection problem is p...

Publication date: 2006
